Automatic Category Analysis

In [ ]:
import pandas as pd
In [2]:
df = pd.read_csv("googleplaystore.csv").dropna()
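Called without arguments, `dropna()` discards every row that has a missing value in any column, so it can remove more data than expected. A minimal sketch on an invented frame (the column values here are assumptions for illustration, not rows from `googleplaystore.csv`) shows one way to check what is being dropped:

```python
import pandas as pd

# Invented two-column frame standing in for the real dataset.
sample = pd.DataFrame({"App": ["a", "b", "c"], "Rating": [4.1, None, 3.9]})

# Count missing values per column before dropping anything.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# dropna() removes rows containing at least one missing value.
cleaned = sample.dropna()
print(len(sample) - len(cleaned), "row(s) dropped")
```

Running the same two checks on the full DataFrame before chaining `.dropna()` would show which columns drive the row loss.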
In [3]:
df.head(5)
Out[3]:
| | App | Category | Rating | Reviews | Size | Installs | Type | Price | Content Rating | Genres | Last Updated | Current Ver | Android Ver |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Photo Editor & Candy Camera & Grid & ScrapBook | ART_AND_DESIGN | 4.1 | 159 | 19M | 10,000+ | Free | 0 | Everyone | Art & Design | January 7, 2018 | 1.0.0 | 4.0.3 and up |
| 1 | Coloring book moana | ART_AND_DESIGN | 3.9 | 967 | 14M | 500,000+ | Free | 0 | Everyone | Art & Design;Pretend Play | January 15, 2018 | 2.0.0 | 4.0.3 and up |
| 2 | U Launcher Lite – FREE Live Cool Themes, Hide ... | ART_AND_DESIGN | 4.7 | 87510 | 8.7M | 5,000,000+ | Free | 0 | Everyone | Art & Design | August 1, 2018 | 1.2.4 | 4.0.3 and up |
| 3 | Sketch - Draw & Paint | ART_AND_DESIGN | 4.5 | 215644 | 25M | 50,000,000+ | Free | 0 | Teen | Art & Design | June 8, 2018 | Varies with device | 4.2 and up |
| 4 | Pixel Draw - Number Art Coloring Book | ART_AND_DESIGN | 4.3 | 967 | 2.8M | 100,000+ | Free | 0 | Everyone | Art & Design;Creativity | June 20, 2018 | 1.1 | 4.4 and up |

Q1. Total number of apps in each category

In [9]:
categories = {}

# Tally each category in a single pass over the column;
# keys are added in order of first appearance.
for name in df['Category']:
    categories[name] = categories.get(name, 0) + 1

for name, count in categories.items():
    print(name, ":", count)
ART_AND_DESIGN : 61
AUTO_AND_VEHICLES : 73
BEAUTY : 42
BOOKS_AND_REFERENCE : 178
BUSINESS : 303
COMICS : 58
COMMUNICATION : 328
DATING : 195
EDUCATION : 155
ENTERTAINMENT : 149
EVENTS : 45
FINANCE : 323
FOOD_AND_DRINK : 109
HEALTH_AND_FITNESS : 297
HOUSE_AND_HOME : 76
LIBRARIES_AND_DEMO : 64
LIFESTYLE : 314
GAME : 1097
FAMILY : 1746
MEDICAL : 350
SOCIAL : 259
SHOPPING : 238
PHOTOGRAPHY : 317
SPORTS : 319
TRAVEL_AND_LOCAL : 226
TOOLS : 733
PERSONALIZATION : 312
PRODUCTIVITY : 351
PARENTING : 50
WEATHER : 75
VIDEO_PLAYERS : 160
NEWS_AND_MAGAZINES : 233
MAPS_AND_NAVIGATION : 124
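The per-category tally above can also be obtained from pandas directly. The following is a minimal sketch using `Series.value_counts()` on a small invented series (the values are assumptions for illustration, not the real `df['Category']` column):

```python
import pandas as pd

# Invented sample standing in for df['Category'].
sample = pd.Series(
    ["GAME", "FAMILY", "GAME", "TOOLS", "FAMILY", "GAME"], name="Category"
)

# value_counts() tallies each distinct value and, by default,
# sorts the result by count in descending order.
counts = sample.value_counts()
print(counts)
```

Unlike the loop, the result is ordered by frequency rather than by first appearance, which is often the more convenient order for spotting the largest categories.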

Q2. Total number of apps in each Type

In [5]:
types = {}

# Tally each type in a single pass over the column.
for value in df['Type']:
    types[value] = types.get(value, 0) + 1

print(types)
{'Free': 8715, 'Paid': 645}

Q3. Total number of apps in each Content Rating

In [6]:
content_rating = {}

# Tally each content rating in a single pass over the column.
for value in df['Content Rating']:
    content_rating[value] = content_rating.get(value, 0) + 1

print(content_rating)
{'Everyone': 7414, 'Teen': 1084, 'Everyone 10+': 397, 'Mature 17+': 461, 'Adults only 18+': 3, 'Unrated': 1}
In [13]:
df['Rating'].describe()
Out[13]:
count    9360.000000
mean        4.191838
std         0.515263
min         1.000000
25%         4.000000
50%         4.300000
75%         4.500000
max         5.000000
Name: Rating, dtype: float64
In [12]:
df['Reviews'].describe()
Out[12]:
count     9360
unique    5990
top          2
freq        83
Name: Reviews, dtype: object
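The `dtype: object` in the summary above suggests that `Reviews` was read as text rather than as numbers, which is why `describe()` reports `unique`, `top` and `freq` instead of a mean and quartiles. A minimal sketch of one possible fix, assuming unparseable entries can safely be turned into `NaN` (the sample values, including the malformed one, are invented for illustration):

```python
import pandas as pd

# Invented values mimicking a text-typed Reviews column; the last entry
# is a made-up malformed value used only to show coercion.
reviews = pd.Series(["159", "967", "87510", "bad value"], name="Reviews")

# errors="coerce" converts anything unparseable to NaN instead of raising.
reviews_num = pd.to_numeric(reviews, errors="coerce")

print(reviews_num.dtype)
print(reviews_num.describe())
```

Applied to the real column, the same call would let `describe()` produce numeric statistics; whether coercing bad entries to `NaN` is appropriate depends on what those entries turn out to be.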